OS Support for a Commodity Database on PC clusters - Distributed Devices vs. Distributed File Systems

Authors

  • Felix Rauch
  • Thomas Stricker
Abstract

In this paper we attempt to parallelise a commodity database for OLAP on a cluster of commodity PCs by using a distributed high-performance storage subsystem. By parallelising the underlying storage architecture, we eliminate the need to make any changes to the database software. We look at two options that differ in complexity and features: distributed devices and distributed file systems. The former aggregates several single disks within the cluster into a RAID device across the network. The latter offers all the features of a real file system at the price of considerably increased complexity. We configured a Linux version of ORACLE to run on various distributed devices and distributed file systems, and ran a TPC-D benchmark on our cluster of commodity PCs interconnected by Gigabit Ethernet. Distributed devices achieve at least the performance of local disks and additionally offer the benefit of using all surplus storage in a cluster. The distributed file systems seem to run into performance problems due to their increased complexity. We explain the experimental results with an analytic model of the cluster architecture and include a comparison with the same workload on an architecture that distributes the TPC-D queries at a higher level (and not just the underlying storage system). We conclude with suggestions for higher performance in future clusters of commodity PCs.
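
As a rough illustration of how a distributed device can aggregate single disks into one RAID device, the sketch below maps a logical block number to a disk and a block on that disk. It assumes plain RAID-0 striping, which the abstract does not specify; the function name `locate_block` and the parameters `num_disks` and `chunk_blocks` are hypothetical and not taken from the paper.

```python
def locate_block(logical_block: int, num_disks: int, chunk_blocks: int) -> tuple[int, int]:
    """Map a logical block to (disk index, block on that disk).

    Assumption: RAID-0 striping with a fixed chunk size of
    `chunk_blocks` blocks, laid out round-robin over `num_disks` disks.
    """
    chunk = logical_block // chunk_blocks      # index of the chunk holding the block
    offset = logical_block % chunk_blocks      # position of the block inside its chunk
    disk = chunk % num_disks                   # chunks rotate round-robin over the disks
    chunk_on_disk = chunk // num_disks         # how many chunks precede it on that disk
    return disk, chunk_on_disk * chunk_blocks + offset


# Four disks, two blocks per chunk: consecutive chunks land on consecutive disks.
for block in range(10):
    print(block, locate_block(block, num_disks=4, chunk_blocks=2))
```

Under this assumption a large sequential scan touches all disks in turn, so the reads can be served by several cluster nodes at once, while the database still sees a single block device.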

Similar resources

Implementation of PC Cluster System with Memory Mapped File by Commodity OS

In this study, we propose a method to implement PC cluster systems on a non-open-source commodity OS. For this purpose, we also introduce an implementation of Distributed Shared Memory that uses a distributed file system and a memory-mapped file without modifying the OS. We have designed and implemented a single DFS by using a high-speed network interface, DIMMnet2, which has a large capac...

HopsFS: Scaling Hierarchical File System Metadata Using NewSQL Databases

Recent improvements in both the performance and scalability of shared-nothing, transactional, in-memory NewSQL databases have reopened the research question of whether distributed metadata for hierarchical file systems can be managed using commodity databases. In this paper, we introduce HopsFS, a next generation distribution of the Hadoop Distributed File System (HDFS) that replaces HDFS’ sing...

Syzygy: Native PC Cluster VR

The Syzygy software library consists of tools for programming VR applications on PC clusters. Since the PC cluster environment presents application development constraints, it is impossible to simultaneously optimize for efficiency, flexibility, and portability between the single-computer and cluster cases. Consequently Syzygy includes two application frameworks: a distributed scene graph frame...

Large Scale Distributed File System Survey

Cloud computing, one type of distributed system, is becoming very popular. It has demonstrated that processing very large data sets on commodity clusters is readily possible with the right programming model and infrastructure. One critical issue lies in the file system (FS) [1]. In this report, I review a number of prominent distributed file systems (DFS).

A simple installation and administration tool for large-scale PC cluster systems: DCAST

In this paper, a new setup and administration tool for PC cluster systems is proposed. Recently, PC cluster systems have become popular in the high-performance computing field. PC cluster systems consist of PCs connected via a network and are used for parallel and distributed computing. PC cluster systems achieve a good cost-to-performance ratio by using commodity hardware to construct the cluster....

Publication date: 2005